Poster: aibek
Date: December 31, 2012 06:32:13am
Forum: texts
Subject: Re: PDF's on Amazon Kindle
First, about the file.
Every page image that we see in it is made up of 3 images: a background, a foreground and a mask. So the PDF really contains not 116 but 348 images. The foreground and background are in RGB; the mask is BW (1 bit per sample).
To see why this is useful, imagine brown text on a yellow page. If such a 1000x1000 sample image is saved in RGB (3 bytes per sample), it would take 3,000,000 bytes before compression. In the IA method the image is decomposed into 3 parts: a brown 1000x1000 image, a yellow 1000x1000 image, and a BW 1000x1000 mask, and each of them is compressed and saved separately in the PDF. It is the job of the reader (the PDF viewer) to join them together. A '0' in the mask image means the reader is to compose the final image by using the corresponding pixel of the foreground image; a '1' means that the reader is to use the corresponding pixel of the background image.
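A minimal sketch of that joining step, in plain Python. The function name `compose`, the toy 2x3 page and the exact RGB values for brown and yellow are my own illustrations, not anything taken from the PDF; only the mask convention (0 = foreground, 1 = background) follows the description above.

```python
# Hypothetical helper: combine the three layers of one page image.
# Convention as described above: mask 0 -> foreground pixel, mask 1 -> background pixel.
def compose(foreground, background, mask):
    """Each layer is a list of rows; colour pixels are (R, G, B) tuples."""
    return [
        [fg if m == 0 else bg for fg, bg, m in zip(fg_row, bg_row, mask_row)]
        for fg_row, bg_row, mask_row in zip(foreground, background, mask)
    ]

BROWN = (139, 69, 19)    # illustrative values only
YELLOW = (255, 255, 0)

# A 2x3 toy page: brown "ink" wherever the mask is 0, yellow paper elsewhere.
foreground = [[BROWN] * 3 for _ in range(2)]
background = [[YELLOW] * 3 for _ in range(2)]
mask = [[1, 0, 1],
        [0, 0, 1]]

page = compose(foreground, background, mask)
for row in page:
    print(row)
```

A real reader does the same thing per pixel, only on decoded image streams rather than Python lists.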
This helps because the three can be compressed much better. At the extreme, you can imagine the brown and the yellow images taking just a few bytes each: 3 bytes for recording the colour, and a few bytes more for recording the dimensions of the image. Also, our most critical data is in the mask -- that image has to be the sharpest. But it now has 1-bit samples, and not the 24 bits (3 bytes) per sample we had earlier.
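To get a rough feel for the numbers, here is a sketch that builds the three layers of the 1000x1000 example and compresses each one separately. Two assumptions to keep in mind: I use Python's `zlib` as a stand-in compressor (the IA files use other filters, such as JPXDecode for the RGB layers), and the mask is an artificial pattern of horizontal bands, not real text.

```python
import zlib

WIDTH = HEIGHT = 1000
BROWN = bytes((139, 69, 19))    # illustrative colour values
YELLOW = bytes((255, 255, 0))

# Flat colour layers: every pixel identical, 3 bytes per pixel.
foreground = BROWN * (WIDTH * HEIGHT)
background = YELLOW * (WIDTH * HEIGHT)

# 1-bit mask packed 8 pixels per byte. Toy pattern: in every block of 40 rows,
# the first 10 rows are "ink" (bits 0) and the remaining 30 are paper (bits 1).
row_bytes = WIDTH // 8
mask = b"".join(
    (b"\x00" if (y % 40) < 10 else b"\xff") * row_bytes
    for y in range(HEIGHT)
)

raw_rgb = WIDTH * HEIGHT * 3
layers = {"foreground": foreground, "background": background, "mask": mask}
compressed = {name: len(zlib.compress(data, 9)) for name, data in layers.items()}

print("single RGB image, uncompressed:", raw_rgb, "bytes")
print("mask, uncompressed:", len(mask), "bytes")
for name, size in compressed.items():
    print(name, "compressed:", size, "bytes")
```

A real mask with text in it will compress far less well than these bands, but it still starts from 125,000 bytes rather than 3,000,000.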
There are a few more details. First, the background image is saved at a lower resolution, and the reader is asked to interpolate. So a small, if insignificant, loss in quality creeps into the IA "compression". Second, the background and the foreground images are not limited to one colour -- they can be full-fledged RGB images too. The point is that the background image is filled with the surrounding colour in the places where the (foreground) objects are -- those parts will not be used anyway -- and thus has a more or less uniform colour or gradient. Similarly, the foreground image is filled with nearby colours in the places where it will not be read. This way the compression is much better. The critical parts are (i) identifying the foreground and background properly, and (ii) filling colours in areas which will not be read so that the image can be compressed best by the compression method of choice ('JPXDecode' for the RGB images in IA files). IA’s PDF files are produced by 'LuraDocument PDF v2.28'.
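The filling idea in (ii) can be sketched as follows. This is only a crude stand-in: I replace every pixel that will never be read with the mean colour of the pixels that will be read, whereas a real encoder would presumably use nearby colours. I do not know how LuraDocument actually does it. The name `fill_unread` and the toy scan are hypothetical.

```python
def fill_unread(layer, mask, read_when):
    """Replace pixels the reader never uses with the mean colour of the used ones.

    layer: rows of (R, G, B) tuples; mask: rows of 0/1 values;
    read_when: the mask value for which this layer is actually read.
    """
    used = [px for row, mrow in zip(layer, mask)
            for px, m in zip(row, mrow) if m == read_when]
    if not used:
        # Nothing is ever read from this layer, so leave it as it is.
        return [list(row) for row in layer]
    mean = tuple(sum(px[c] for px in used) // len(used) for c in range(3))
    return [
        [px if m == read_when else mean for px, m in zip(row, mrow)]
        for row, mrow in zip(layer, mask)
    ]

BROWN = (139, 69, 19)    # illustrative values only
YELLOW = (255, 255, 0)

# Toy "scan": brown ink on yellow paper; mask 0 marks ink, 1 marks paper.
scan = [[YELLOW, BROWN, YELLOW],
        [BROWN, BROWN, YELLOW]]
mask = [[1, 0, 1],
        [0, 0, 1]]

background = fill_unread(scan, mask, read_when=1)  # ink positions become paper colour
foreground = fill_unread(scan, mask, read_when=0)  # paper positions become ink colour
print(background)
print(foreground)
```

Both layers come out uniform, which is exactly what makes them cheap to compress; the mask alone carries the shape of the text.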
The compression is pretty significant. In the attached page 2 of the Devises et Emblemes file, the three images together take 42 KB. The dimensions of the composed image, however, are 2201x3063, so RGB (3 bytes per sample) would take 19 MB! This is the size of the file you will get when you join the three together in the intended manner.
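The arithmetic behind that figure can be checked quickly. The only assumptions here are that "KB" and "MB" mean 1024 and 1024x1024 bytes.

```python
width, height = 2201, 3063
bytes_per_sample = 3                      # RGB, one byte per channel

raw_bytes = width * height * bytes_per_sample
layers_bytes = 42 * 1024                  # the three compressed layers together

print(raw_bytes)                          # 20224989
print(round(raw_bytes / 2**20, 1))        # 19.3 (MB, counting 1 MB = 2**20 bytes)
print(round(raw_bytes / layers_bytes))    # how many times smaller the layers are
```

So the three layers together are several hundred times smaller than the composed page held as raw RGB.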
So, most likely, the Kindle is refusing to do this work. Please check for the text-layer issue too. The attached orig-p.2.pdf is p. 2 of the book. The minus-textlayer.pdf is that page minus the text layer. (Both files are 40 KB in size.) I am assuming that your Kindle can read neither of the files.